About Me

I am a fifth-year Ph.D. candidate in Computer Science at the Georgia Institute of Technology, working in SSLAB, co-advised by Taesoo Kim and Anand Iyer. Before Georgia Tech, I graduated from The Chinese University of Hong Kong with a Bachelor's degree in Computer Science. My research interests include systems for deep graph learning and machine learning in general. I am currently exploring systems aspects of training and serving dynamic GNNs and GraphLLMs.


Selected Projects

Model Training Provenance with Confidential Computing

While large-scale training has enabled models with unprecedented capabilities, it also introduces significant risks: from misuse in malicious settings (e.g., autonomous vulnerability exploitation) to regulatory non-compliance. We are developing verifiable ML training pipelines through the application of Confidential Computing (CC). CC provides a hardware-based, secure method of verifying the training process, protecting both the model provider's and the data provider's sensitive data while allowing third-party verifiers to audit the training process.

March 2024 – Present


High-throughput Inference for Early-exit LLMs

Machine learning inference platforms continue to face high request rates and strict latency constraints. Existing solutions largely focus on compressing models to substantially lower compute costs with mild accuracy degradation. We explore an alternate (but complementary) technique that trades off accuracy and resource costs on a per-input granularity: early-exit models, which selectively allow certain inputs to exit the model from an intermediate layer. We present the first system that makes early-exit models practical for realistic inference deployments. Our key insight is to split and replicate blocks of layers in a manner that maintains a constant batch size throughout execution, all the while accounting for resource requirements and communication overheads. Evaluations show that we accelerate the goodput of early-exit model inference for autoregressive LLMs (2.8-3.8x) and compressed models (1.67x).

May 2023 – Sep. 2024
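
The per-input exit behavior described above can be sketched in a few lines. This is an illustrative toy, not the system's actual design: the function names, confidence heuristic, and threshold are assumptions, and it shows only the exit semantics, not the layer splitting and replication that keeps batch sizes constant.

```python
# Toy sketch of per-input early exit: inputs whose intermediate result
# looks "confident enough" leave the model at an intermediate layer.
# All names and the confidence heuristic here are illustrative assumptions.

def run_with_early_exit(batch, layers, confidence, threshold=0.9):
    """Run `batch` through `layers`, letting inputs exit early.

    batch      -- list of (input_id, value) pairs
    layers     -- list of functions, each mapping value -> value
    confidence -- function mapping value -> float in [0, 1]
    """
    results = {}          # input_id -> (value, exit_layer)
    active = list(batch)  # inputs still flowing through the model
    for depth, layer in enumerate(layers):
        active = [(i, layer(v)) for i, v in active]
        still_active = []
        for i, v in active:
            if confidence(v) >= threshold:   # confident: exit at this layer
                results[i] = (v, depth)
            else:                            # not confident: keep going
                still_active.append((i, v))
        active = still_active
        if not active:
            break
    for i, v in active:                      # remainder exits at the last layer
        results[i] = (v, len(layers) - 1)
    return results
```

The systems challenge the project targets is visible even here: after each layer, the surviving batch shrinks, so naive batched execution underutilizes the accelerator, which is what the split-and-replicate placement addresses.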


System for Dynamic Graph Neural Networks at Scale

Existing systems for processing static GNNs either do not support dynamic GNNs or are inefficient in doing so. In this project, we are building a system that supports dynamic GNNs efficiently. Based on the observation that existing proposals for dynamic GNN architectures combine techniques for structural and temporal information encoding independently, we propose novel techniques that enable cross-optimization of the two, benefiting tasks such as traffic forecasting, anomaly detection, and epidemiological forecasting.

May 2021 – Dec. 2023
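
The structural/temporal split that the observation above refers to can be illustrated with a toy discrete-time sketch: a per-snapshot structural pass (mean-of-neighbors aggregation) feeding a temporal pass (a running per-node blend across snapshots). Real dynamic GNNs use learned GNN and RNN layers; the names and the EMA-style temporal update here are illustrative assumptions, not the project's actual architecture.

```python
# Toy discrete-time dynamic GNN: encode each graph snapshot structurally,
# then fold it into a per-node temporal state. Illustrative only.

def structural_pass(features, adj):
    """Average each node's feature with its neighbors' (one GCN-like hop)."""
    out = {}
    for node, feat in features.items():
        vals = [feat] + [features[n] for n in adj.get(node, [])]
        out[node] = sum(vals) / len(vals)
    return out

def temporal_pass(prev_state, current, alpha=0.5):
    """Blend the encoded snapshot into the running per-node state."""
    return {n: alpha * current[n] + (1 - alpha) * prev_state.get(n, 0.0)
            for n in current}

def encode_dynamic_graph(snapshots, alpha=0.5):
    """snapshots: list of (features, adj) pairs, one per time step."""
    state = {}
    for features, adj in snapshots:
        state = temporal_pass(state, structural_pass(features, adj), alpha)
    return state
```

Because the two passes are composed independently per snapshot, a system that sees both at once can fuse or cache work across them, which is the kind of cross-optimization the project exploits.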


Publications and Preprints

  1. Improving DNN Inference Throughput Using Practical, Per-Input Compute Adaptation [To appear]
     Anand Iyer, Mingyu Guan, Yinwei Dai, Rui Pan, Swapnil Gandhi, Ravi Netravali
     In Proceedings of the 30th Symposium on Operating Systems Principles (SOSP)
     Austin, TX, USA, Nov 2024

  2. HetTree: Heterogeneous Tree Graph Neural Network [paper]
     Mingyu Guan, Jack W. Stokes, Qinlong Luo, Fuchen Liu, Purvanshi Mehta, Elnaz Nouri, Taesoo Kim
     arXiv preprint arXiv:2402.13496, Feb 2024

  3. DynaGraph: Dynamic Graph Neural Networks at Scale [paper]
     Mingyu Guan, Anand Padmanabha Iyer, Taesoo Kim
     In Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA) (GRADES-NDA)
     Philadelphia, PA, USA, June 2022


Services

  • Artifact Evaluation Committee, The 30th Symposium on Operating Systems Principles (SOSP '24).

  • External Review Committee, 2024 USENIX Annual Technical Conference (ATC '24).